{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Best Practices for Scientific Computing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Summary of Best Practices, according to [[1]](http://dx.doi.org/10.1371/journal.pbio.1001745):\n",
    "\n",
    " 1. Write programs for people, not computers.\n",
    " 1. Let the computer do the work.\n",
    " 1. Make incremental changes.\n",
    " 1. Don't repeat yourself (or others).\n",
    " 1. Plan for mistakes.\n",
    " 1. Optimize software only after it works correctly.\n",
    " 1. Document design and purpose, not mechanics.\n",
    " 1. Collaborate."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[1]: *Best Practices for Scientific Computing*\n",
    "\n",
    "Greg Wilson, D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, Kathryn D. Huff, Ian M. Mitchell, Mark D. Plumbley, Ben Waugh, Ethan P. White, Paul Wilson.\n",
    "\n",
    "PLOS, http://dx.doi.org/10.1371/journal.pbio.1001745"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 1. Code Readability"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## The Zen of Python\n",
    "From [PEP 20](https://www.python.org/dev/peps/pep-0020/)\n",
    "\n",
    "- Beautiful is better than ugly.\n",
    "- Explicit is better than implicit.\n",
    "- Simple is better than complex.\n",
    "- Complex is better than complicated.\n",
    "- Flat is better than nested.\n",
    "- Sparse is better than dense.\n",
    "- **Readability counts.**\n",
    "- Special cases aren't special enough to break the rules.\n",
    "- Although practicality beats purity.\n",
    "- Errors should never pass silently.\n",
    "- Unless explicitly silenced.\n",
    "- In the face of ambiguity, refuse the temptation to guess.\n",
    "- There should be one-- and preferably only one --obvious way to do it.\n",
    "- Although that way may not be obvious at first unless you're Dutch.\n",
    "- Now is better than never.\n",
    "- Although never is often better than *right* now.\n",
    "- If the implementation is hard to explain, it's a bad idea.\n",
    "- If the implementation is easy to explain, it may be a good idea.\n",
    "- Namespaces are one honking great idea -- let's do more of those!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Variable/function names\n",
    " \n",
    "Scientists often just translate their equations into code:\n",
    " \n",
    "Bad:\n",
    "```python\n",
    "p0 = 3.5\n",
    "p = p0 * np.cos(0.4 * x - 13.2 * t)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Better:\n",
    "```python\n",
    "# Constants:\n",
    "base_pressure = 3.5  # Pa\n",
    "wave_length = 15.7   # m\n",
    "wave_number = 2 * np.pi / wave_length  # m-1\n",
    "angular_frequency = 13.2  # Hz\n",
    "\n",
    "# Calculate:\n",
    "pressure = base_pressure * np.cos(wave_number * x - angular_frequency * t)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "*If you have to sacrifice readability for performance reasons, make sure you add comments to explain the code.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Coding style.\n",
    "\n",
    "This will vary from project to project, but a recommendation for a \"default\" style exists for Python, the [PEP 8](https://www.python.org/dev/peps/pep-0008/) (PEP = Python Enchancement Proposal). Several editors have functionality to give you hints for improving your style."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 3. Use version control"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Avoid this:\n",
    "\n",
    "    mycode.py\n",
    "    mycode_v2.py\n",
    "    mycode_v2_conference.py\n",
    "    mycode_v3_BROKEN.py\n",
    "    mycode_v3_FIXED.py\n",
    "    mycode_v2+v3.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead use a version control system (VCS). There exist many alternatives, but the most common is to use *git* in combination with [github](github.com)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## What is version control?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " - A system to store history of files/folders.\n",
    " - Typically when you want to store a new version of your files, you make a *commit*.\n",
    " - A folder under version control is normally called a *repository*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Why use version control?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Backups: \n",
    "    * Manage changes of files such as scripts, source code, documents, etc\n",
    "    * Store copy of git repository on an external platform (e.g. github)\n",
    "* Organization:  \n",
    "    * Retrieve old versions of files.\n",
    "    * Print history of changes.\n",
    "* Collaboration:\n",
    "   * Useful for programming or writing in teams.\n",
    "   * Programmers work on *copies* of repository files and upload the changes to the official repository.\n",
    "   * Non-conflicting modifications by different team members are merged automatically.\n",
    "   * Conflicting modifications are detected and must be resolved manually."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## How does it work?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " - A simple repository will normally be a linear series of commits.\n",
    " - However, changes can also branch off, and be merged together."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Commit flow\n",
    "\n",
    "![Git commit flowchart](../reference/git-commits.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Branch flow\n",
    "\n",
    "![Git branch flowchart](../reference/git-branches.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Non-software use\n",
    "\n",
    "Version control can be used for more than just making software!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import YouTubeVideo\n",
    "YouTubeVideo('kM9zcfRtOqo')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## How do I use it?\n",
    "\n",
    "If you want to use git, the base program is terminal based, and to unlock the most powerful features of git you will have to use it in this way. However, if you are just starting out, it might be a good idea to start out with a graphical interface, at least until you get the hang of the git flow."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suggested graphical UIs:\n",
    " - Windows/Mac: [GitHub Desktop](https://desktop.github.com/)\n",
    " - Linux: [SmartGit](https://www.syntevo.com/smartgit/), [Git-cola](https://git-cola.github.io/) or [Gitg](https://wiki.gnome.org/Apps/Gitg/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 5. Use tools to automate testing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "You will make mistakes when programming! Everyone does. The trick is in spotting the mistakes early and efficiently (after release/publication is normally undesirable)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Best tool available: Unit tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## How to write unit tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " 1. Identify a *unit* in your program that should have a well defined behavior given a certain input. A unit can be a:\n",
    "   - function\n",
    "   - module\n",
    "   - entire program\n",
    " 1. Write a test function that calls this input and checks that the output/behavior is as expected\n",
    " 1. The more the better! Preferably on several levels (function/module/program)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## How to write unit tests in Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use a test framework like [py.test](http://docs.pytest.org/en/latest/) or [nose](https://nose.readthedocs.io/en/latest/). Several other frameworks exist as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```bash\n",
    "$ pip install pytest\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Make a file `test_<unit or module name>.py`, preferably in a folder called `tests`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Import the code to be tested."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Add a function `def test_<test name>():`, and have it check for the correct behavior given a certain input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from mandlebrot import mandelbrot\n",
    "\n",
    "x = np.linspace(-2.25, 0.75, 500)\n",
    "y = np.linspace(-1.25, 1.25, 500)\n",
    "output = mandelbrot(x, y, 200, False)\n",
    "\n",
    "plt.imshow(output)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from mandlebrot import mandelbrot\n",
    "\n",
    "def test_mandelbrot_small():\n",
    "    x = np.linspace(-2.25, 0.75, 10)\n",
    "    y = np.linspace(-1.25, 1.25, 10)\n",
    "    output = mandelbrot(x, y, 100, False)\n",
    "    assert output.shape == (10, 10)\n",
    "\n",
    "def test_mandelbrot_zero_outside():\n",
    "    # The Mandelbrot set should be zero outside the \"active\" area\n",
    "    x = np.linspace(1.5, 2.0, 10)\n",
    "    y = np.linspace(1.5, 2.0, 10)\n",
    "    output = mandelbrot(x, y, 100, False)\n",
    "    assert np.all(output == 0.0)\n",
    "\n",
    "def test_mandelbrot_incorrect_test():\n",
    "    x = np.linspace(-1.5, -2.0, 10)\n",
    "    y = np.linspace(-1.25, 1.25, 10)\n",
    "    output = mandelbrot(x, y, 100, False)\n",
    "    assert np.all(output == 0.0)\n",
    "  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "test_mandelbrot_small()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "test_mandelbrot_zero_outside()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_mandelbrot_incorrect_test()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## How to run tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Call `py.test` or `nosetests` in the folder containing your project. The tools will look for anything that looks like a test (e.g. `test_*()` functions in `test_*.py` files) in your project."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!py.test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Utilities to express expected behavior"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While `assert` and comparisons operators like `==`, `>` etc. should be able to express most expected behaviors, there are a few utilities that can greatly simplify tests, a few of these include:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "### Numpy tests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "small_noise = 1e-6 * np.random.rand(10, 10)\n",
    "np.testing.assert_almost_equal(small_noise, 0.0, decimal=5)  # OK\n",
    "np.testing.assert_almost_equal(small_noise, 0.0, decimal=7)  # Will likely fail"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Python modules"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Local modules"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " - Previously, we have only written code to scripts or cells in Notebooks.\n",
    " - All python files can also be treated as modules, i.e. they can be imported in other Python files / Notebooks.\n",
    " - However, compared to scripts, modules typically only contain definitions (functions/classes/constants) meant to be imported somewhere else"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Given a file `mymodule.py` with a function `my_function()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile mymodule.py\n",
    "\n",
    "def my_function():\n",
    "    return \"Surprise!\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function can then be imported in another file or a Notebook:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mymodule import my_function\n",
    "\n",
    "my_function()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Packages"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Modules and scripts can be organized into packages. These can then easily be distributed and imported by name. A set of built-in packages are included in Python, like:\n",
    "* **sys** System specific functionality\n",
    "* **os** Operating system specific functionality"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other packages can be downloaded and installed, e.g.\n",
    "* **scipy** Scientific Python (www.scipy.org)\n",
    "    * **numpy** Numerical Python\n",
    "    * **ipython** Interactive Python\n",
    "    * **matplotlib** Plotting\n",
    "    * **pandas** Data analsyis\n",
    "\n",
    "*Several useful packages are included in Python distributions like Anaconda*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## The Python Package Index (PyPI) collects a large number of modules\n",
    "\n",
    "Official webpage https://pypi.python.org/pypi\n",
    "\n",
    "Search the Python index with\n",
    "```bash\n",
    "pip search KEYWORD \n",
    "```\n",
    "Install any new package with\n",
    "```bash\n",
    "pip install PACKAGENAME --user\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Final points"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " - Before publishing/release: Get your code to run on another computer! Preferably on another platform.\n",
    " - When unit tests and version control is combined you can have tests be run automatically each time changes are pushed to the repository. This is called *continuous integration*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Challenges"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " - Take your code from the previous heat equation diffusion task and make it into a module.\n",
    "   - Put the reference implementation in one submodule, and the plotting code in a script that can be called.\n",
    " - Add it to a git repository.\n",
    " - Add unit tests that checks that the diffusion algorthim behaves as expected. Commit it to the repository.\n",
    " - Replace the Python code with the vectorized version, and commit that to the repository. Check that the vectorized version passes the test!\n",
    " - [Bonus] Turn your module into a package, and expose the callable script as a console entry point."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
